In this chapter, we continue our discussion on graphs and turn our attention to finding matchings in graphs. Algorithms to find matchings are used a lot in real-world applications, and we discuss some of these applications in lecture. This chapter will expand your toolkit for reasoning about graphs and help you build more intuition about them.
A matching in a graph is a subset of the edges that do not share an endpoint. A maximum matching in is a matching with the maximum number of edges among all possible matchings. A maximal matching is a matching with the property that if we add any other edge to the matching, it is no longer a matching. A perfect matching is a matching that covers all the vertices of the graph.
Consider the following graph.
Note that the empty set and a set with only one edge is always a matching. The set is a maximal matching with 2 edges, since we if we tried to add another edge to this set, it would no longer be a matching. On the other hand, this maximal matching is not a maximum matching because there is another matching with 3 edges: . This graph does not have a perfect matching. One easy way to see this is that it has an odd number of vertices, and any graph with an odd number of vertices cannot have a perfect matching.
The size of a matching refers to the number of edges in the matching, and is denoted by . Note that this coincides with the size of the set that represents.
In the maximum matching problem the input is an undirected graph and the output is a maximum matching in .
Let be a graph and let be a matching in . An augmenting path in with respect to is a path such that
the path is an alternating path, which means that the edges in the path alternate between being in and not in ,
the first and last vertices in the path are not a part of the matching .
Consider a single edge in a graph such that and are not matched by a matching (this means that vertices and do not belong to any of the edges in ). Then this edge forms an augmenting path.
Consider the following graph.
Let be the matching . Then the path is an augmenting path with respect to .
An augmenting path does not need to contain all the edges in . It is also possible that it does not contain any of the edges of . A single edge , where is not matched and is not matched, is an augmenting path.
Let be a graph. A matching is maximum if and only if there is no augmenting path in with respect to .
The statement we want to prove is equivalent to the following. Given a graph , a matching is not maximum if and only if there is an augmenting path in with respect to . There are two directions to prove.
First direction: Suppose there is an augmenting path in with respect to . Then we want to show that is not maximum. Let the augmenting path be :
The highlighted edges represent edges in . By the definition of an augmenting path, we know that and are not matched by . Since and are not matched and the path is alternating, the number of edges on this path that are in the matching is one less than the number of edges not in the matching. To see that is not a maximum matching, observe that we can obtain a bigger matching by flipping the matched and unmatched edges on the augmenting path. In other words, if an edge on the path is in the matching, we remove it from the matching, and if an edge on the path is not in the matching, we put it in the matching. This gives us a matching larger than , so is not maximum.
Second direction: We now prove the other direction. In particular, we want to show that if is not a maximum matching, then we can find an augmenting path in with respect to . Since is not a maximum, there is a matching with larger size. Let be a matching such that . We define the set to be the set of edges contained in or , but not both. That is, (this is the symmetric difference of and ). If we color the edges in with blue, and the edges in with red, then consists of edges that are colored either blue or red, but not both (i.e. no purple edges). Below is an example:
(Horizontal edges correspond to the red edges. The rest is blue.) Our goal is to find an augmenting path with respect to in (i.e., with respect to the blue edges), and once we do this, the proof will be complete.
We now proceed to find an augmenting path with respect to in . To do so, we make a couple of important observations about . First, notice that each vertex that is a part of has degree or because it can be incident to at most one edge in and at most one edge in . (If the degree was more than , either two edges from or two edges from would be intersecting, but in a matching, edges cannot intersect.) We now make two claims:
Because every vertex has degree or , consists of disjoint paths and cycles (i.e. each connected component is either a path or a cycle).
The edges in these paths and cycles alternate between blue and red.
The proof of the first claim is omitted and is left as an exercise for the reader. The second claim is true because if the edges were not alternating, i.e., if there were two red or two blue edges in a row, then this would imply the red edges or the blue edges do not form a matching (remember that in a matching no two edges can share an endpoint).
Since is a bigger matching than , we know that has more red edges than blue edges. Observe that the cycles in must have even length, because otherwise the edges cannot alternate between blue and red. Therefore the cycles have an equal number of red and blue edges. This implies that there must be a path in with more red edges than blue edges. In particular, this path starts and ends with a red edge. This path is an augmenting path with respect to (i.e., the blue edges), since it is clearly alternating between edges in and edges not in , and the endpoints are unmatched with respect to . So using the assumption that is not maximum, we were able to find an augmenting path with respect to . This completes the proof.
Let be a graph such that all vertices have degree at most . Then prove that every connected component of is either a path or a cycle (where we count an isolated vertex as a path of length 0).
Consider a graph such that all vertices have degree at most 2. We want to show that it consists of disjoint paths and cycles. We prove this by induction on the number of vertices.
Pick an arbitrary vertex in the graph. Removing results in a graph such that every vertex has degree at most 2. Since has one less vertex, by induction hypothesis, consists of disjoint paths and cycles. There are 3 cases to consider: or . It is not hard to see that in each case, adding back to the graph preserves the property that the graph is a collection of disjoint paths and cycles. (Verify this part for yourself.)
Assume, for the sake of contradiction, there are two perfect matchings and . Then as in the proof of Theorem (Characterization for maximum matchings), look at the symmetric difference between these two matchings.
The proof is by contradiction, so suppose a tree has two different perfect matchings and . As in the proof of Theorem (Characterization for maximum matchings), we’ll consider the symmetric difference between these two matchings. So let be the symmetric difference between and , i.e., . Since and , we must have (there must be an edge in that is not in , and vice versa). The set corresponds to a graph in which each vertex has degree at most . So this graph consists of disjoint paths and cycles (i.e. each connected component is either a path or a cycle). But a connected component cannot be a cycle since trees are acyclic. It also cannot be a path. This is because the existence of a degree 1 vertex in implies that this vertex is not covered/matched by either or (verify this yourself), and this would contradict the fact that and are perfect matchings covering all vertices. So must be the empty set, which contradicts our assumption that .
Below is an example of a bipartite graph.
Let be a graph. Let . A -coloring of is just a map where is a set of cardinality . (Usually the elements of are called colors. If then is a popular choice. If is large, we often just call the “colors” .) A -coloring is said to be legal for if every edge in is bichromatic, meaning that its two endpoints have different colors. (I.e., for all it is required that .) Finally, we say that is -colorable if it has a legal -coloring.
The graph below is -colorable. We can color the vertex at the center green, and color the outer vertices with blue and red by alternating those two colors.
A graph is bipartite if and only if it is -colorable. The -coloring corresponds to partitioning the vertex set into and such that all the edges have one endpoint in and the other in .
A graph is bipartite if and only if it contains no odd-length cycles.
There are two directions to prove.
(): For this direction, we want to show that if a graph is bipartite, then it contains no odd-length cycles. We prove the contrapositive. Observe that it is impossible to -color an odd-length cycle. So if a graph contains an odd-length cycle, the graph cannot be -colored, and therefore cannot be bipartite.
(): For this direction, we want to show that if a graph does not contain an odd-length cycle, then it is bipartite. So suppose the graph contains no cycles of odd length. Without loss of generality, assume the graph is connected (if it is not, we can apply the argument to each connected component separately). For , let denote the length of the shortest path from to (or from to ). Pick a starting vertex/root and consider the “BFS tree” rooted at . In this tree, level 0 corresponds to , and level corresponds to all vertices with . Color odd-indexed levels blue, and color even-indexed levels red.
The proof is done once we show that this is a valid -coloring of the graph. To show this, we’ll argue that no edge has its endpoints colored the same color. There are two types of edges we need to worry about that could potentially have its endpoints colored the same color. We consider each type below.
First, there could potentially be an edge between two vertices and at the same level. Let’s assume such an edge exists. Let be the lowest common ancestor of and . Note that , so the path from to , plus the path from to , plus the edge , form an odd-length cycle. This is a contradiction.
Second, we need to consider the possibility that there is an edge between a vertex at level and a vertex at level for some . However, the existence of such an edge implies that , which contradicts the fact that is at level . So this type of edge cannot exist either. This completes the proof.
There is a polynomial time algorithm to solve the maximum matching problem in bipartite graphs.
Let be the input graph. The high level steps of the algorithm is as follows.
Let where is an arbitrary edge.
Repeat until there is no augmenting path with respect to :
Find an augmenting path with respect to .
Update according to the augmenting path (swapping matched and unmatched edges along the path).
Every time we find an augmenting path, we increase the size of our matching by one. When there are no more augmenting paths, we stop and correctly output a maximum matching (the correctness follows from Theorem (Characterization for maximum matchings)). The only unclear step of the algorithm is finding an augmenting path with respect to . And we explain how to do this step below. But before we do that, note that if this step can be done in polynomial time, then the overall running time of the algorithm is polynomial time since the loop repeats times and the work done in each iteration is polynomial time.
We now show how to find an augmenting path (given and ). We first turn into a directed graph as follows:
Direct edges in from to .
Direct edges in from to .
Note that an alternating path with respect to in corresponds to a directed path in . (We leave it to the reader to verify this.) This means that there is an augmenting path with respect to in if and only if there is a directed path in from an unmatched vertex in to an unmatched vertex in . So to find an augmenting path, we’ll search for a directed path in from an unmatched to an unmatched , as follows:
For each unmatched :
Run DFS(, ).
If you hit an unmatched , output the path from to .
Output “no augmenting path found.”
The running time is polynomial time since the loop repeats at most times, and the work done in each iteration is polynomial time.
The high-level algorithm above presented in the proof of Theorem (Finding a maximum matching in bipartite graphs) is in fact applicable to general (not necessarily bipartite) graphs. However, the step of finding an augmenting path with respect to a matching turns out to be much more involved, and therefore we do not cover it in this chapter. See https://en.wikipedia.org/wiki/Blossom_algorithm if you would like to learn more.
Let be a bipartite graph. For a subset of the vertices, let . Then has a matching covering all the vertices in if and only if for all , we have .